show_as_row(_add_batch_dim(torch.ones(2)), _add_batch_dim(torch.ones(2,2)), _add_batch_dim(torch.ones(2,2,2)))
#0
tensor([[[1.], [1.]]])
#1
tensor([[[1., 1.], [1., 1.]]])
#2
tensor([[[1., 1.], [1., 1.]], [[1., 1.], [1., 1.]]])
The model uses a latent state variable \(x\) that is modelled over time, to impute gaps in \(y\).
The assumption of the model is that the state variable at time \(t\), \(x_t\), depends only on the previous state \(x_{t-1}\) and not on earlier states.
The equations of the model are:
\[\begin{align} p(x_t | x_{t-1}) & = \mathcal{N}(Ax_{t-1} + b, Q) \\ p(y_t | x_t) & = \mathcal{N}(Hx_t + d, R) \end{align}\]
where:
- \(A\): state transition matrix
- \(b\): state transition offset
- \(Q\): state transition covariance matrix
- \(H\): observation matrix
- \(d\): observation offset
- \(R\): observation covariance matrix

In addition, the model has the parameters of the initial state that are used to initialize the filter:
- \(m_0\): initial state mean
- \(P_0\): initial state covariance
The Kalman filter has 3 steps: predict, update, and smooth.
In case of missing data the update step is skipped.
After smoothing the missing data at time t (\(y_t\)) can be imputed from the state (\(x_t\)) using this formula: \[p(y_t|x_t) = \mathcal{N}(Hx_t + d, R + HP^s_tH^T)\]
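As a minimal sketch (hypothetical helper name, assuming the matrices defined above and a smoothed state `m_s`, `P_s` at time t), the imputation formula translates to:

import torch

def impute_obs(H, d, R, m_s, P_s):
    "Sketch of p(y_t|x_t) = N(H m_s + d, R + H P_s H^T) for a smoothed state."
    mean = H @ m_s + d              # predicted observation mean
    cov = R + H @ P_s @ H.mT        # observation covariance including state uncertainty
    return mean, cov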
The Kalman Filter is an algorithm designed to estimate \(P(x_t | y_{0:t})\). As all state transitions and observations are linear with Gaussian distributed noise, these distributions can be represented exactly as Gaussian distributions with mean `ms[t]` and covariance `Ps[t]`. Similarly, the Kalman Smoother is an algorithm designed to estimate \(P(x_t | y_{0:T-1})\), i.e. the state given all the observations.
KalmanFilterBase (A:torch.Tensor, H:torch.Tensor, B:torch.Tensor, Q:torch.Tensor, R:torch.Tensor, b:torch.Tensor, d:torch.Tensor, m0:torch.Tensor, P0:torch.Tensor, n_dim_state:int=None, n_dim_obs:int=None, n_dim_contr:int=None, var_names:Optional[Iterable[str]]=None, contr_names:Optional[Iterable[str]]=None, cov_checker:meteo_imp.gaussian.CheckPosDef|None=None, use_conditional:bool=False, use_control:bool=True, use_smooth:bool=True, pred_only_gap:bool=False, pred_std:bool=False)
Base class for handling Kalman Filter implementation in PyTorch
| | Type | Default | Details |
|---|---|---|---|
| A | Tensor | | [n_dim_state, n_dim_state] \(A\), state transition matrix |
| H | Tensor | | [n_dim_obs, n_dim_state] \(H\), observation matrix |
| B | Tensor | | [n_dim_state, n_dim_contr] \(B\), control matrix |
| Q | Tensor | | [n_dim_state, n_dim_state] \(Q\), state transition covariance matrix |
| R | Tensor | | [n_dim_obs, n_dim_obs] \(R\), observations covariance matrix |
| b | Tensor | | [n_dim_state] \(b\), state transition offset |
| d | Tensor | | [n_dim_obs] \(d\), observations offset |
| m0 | Tensor | | [n_dim_state] \(m_0\) |
| P0 | Tensor | | [n_dim_state, n_dim_state] \(P_0\) |
| n_dim_state | int | None | Number of dimensions for state - default inferred from parameters |
| n_dim_obs | int | None | Number of dimensions for observations - default inferred from parameters |
| n_dim_contr | int | None | Number of dimensions for control - default inferred from parameters |
| var_names | Optional | None | Names of variables for printing |
| contr_names | Optional | None | Names of control variables for printing |
| cov_checker | meteo_imp.gaussian.CheckPosDef \| None | None | Check covariance at every step |
| use_conditional | bool | False | Use conditional distribution for gaps that don't have all variables missing |
| use_control | bool | True | Use the control in the filter |
| use_smooth | bool | True | Use smoother for predictions (otherwise filter only) |
| pred_only_gap | bool | False | If True, predictions are only for the gap |
| pred_std | bool | False | Return only standard deviations and not covariances |
Giving all the parameters manually to the KalmanFilterBase
init method is not convenient, hence there are some class methods that help initialize the class.
Due to a bug in fastcore it is not possible to subclass after creating the class methods.
KalmanFilter (A:torch.Tensor, H:torch.Tensor, B:torch.Tensor, Q:torch.Tensor, R:torch.Tensor, b:torch.Tensor, d:torch.Tensor, m0:torch.Tensor, P0:torch.Tensor, n_dim_state:int=None, n_dim_obs:int=None, n_dim_contr:int=None, var_names:Optional[Iterable[str]]=None, contr_names:Optional[Iterable[str]]=None, cov_checker:meteo_imp.gaussian.CheckPosDef|None=None, use_conditional:bool=False, use_control:bool=True, use_smooth:bool=True, pred_only_gap:bool=False, pred_std:bool=False)
Base class for handling Kalman Filter implementation in PyTorch
| | Type | Default | Details |
|---|---|---|---|
| A | Tensor | | [n_dim_state, n_dim_state] \(A\), state transition matrix |
| H | Tensor | | [n_dim_obs, n_dim_state] \(H\), observation matrix |
| B | Tensor | | [n_dim_state, n_dim_contr] \(B\), control matrix |
| Q | Tensor | | [n_dim_state, n_dim_state] \(Q\), state transition covariance matrix |
| R | Tensor | | [n_dim_obs, n_dim_obs] \(R\), observations covariance matrix |
| b | Tensor | | [n_dim_state] \(b\), state transition offset |
| d | Tensor | | [n_dim_obs] \(d\), observations offset |
| m0 | Tensor | | [n_dim_state] \(m_0\) |
| P0 | Tensor | | [n_dim_state, n_dim_state] \(P_0\) |
| n_dim_state | int | None | Number of dimensions for state - default inferred from parameters |
| n_dim_obs | int | None | Number of dimensions for observations - default inferred from parameters |
| n_dim_contr | int | None | Number of dimensions for control - default inferred from parameters |
| var_names | Optional | None | Names of variables for printing |
| contr_names | Optional | None | Names of control variables for printing |
| cov_checker | meteo_imp.gaussian.CheckPosDef \| None | None | Check covariance at every step |
| use_conditional | bool | False | Use conditional distribution for gaps that don't have all variables missing |
| use_control | bool | True | Use the control in the filter |
| use_smooth | bool | True | Use smoother for predictions (otherwise filter only) |
| pred_only_gap | bool | False | If True, predictions are only for the gap |
| pred_std | bool | False | Return only standard deviations and not covariances |
KalmanFilterSR (A:torch.Tensor, H:torch.Tensor, B:torch.Tensor, Q:torch.Tensor, R:torch.Tensor, b:torch.Tensor, d:torch.Tensor, m0:torch.Tensor, P0:torch.Tensor, n_dim_state:int=None, n_dim_obs:int=None, n_dim_contr:int=None, var_names:Optional[Iterable[str]]=None, contr_names:Optional[Iterable[str]]=None, cov_checker:meteo_imp.gaussian.CheckPosDef|None=None, use_conditional:bool=False, use_control:bool=True, use_smooth:bool=True, pred_only_gap:bool=False, pred_std:bool=False)
Base class for handling Kalman Filter implementation in PyTorch
| | Type | Default | Details |
|---|---|---|---|
| A | Tensor | | [n_dim_state, n_dim_state] \(A\), state transition matrix |
| H | Tensor | | [n_dim_obs, n_dim_state] \(H\), observation matrix |
| B | Tensor | | [n_dim_state, n_dim_contr] \(B\), control matrix |
| Q | Tensor | | [n_dim_state, n_dim_state] \(Q\), state transition covariance matrix |
| R | Tensor | | [n_dim_obs, n_dim_obs] \(R\), observations covariance matrix |
| b | Tensor | | [n_dim_state] \(b\), state transition offset |
| d | Tensor | | [n_dim_obs] \(d\), observations offset |
| m0 | Tensor | | [n_dim_state] \(m_0\) |
| P0 | Tensor | | [n_dim_state, n_dim_state] \(P_0\) |
| n_dim_state | int | None | Number of dimensions for state - default inferred from parameters |
| n_dim_obs | int | None | Number of dimensions for observations - default inferred from parameters |
| n_dim_contr | int | None | Number of dimensions for control - default inferred from parameters |
| var_names | Optional | None | Names of variables for printing |
| contr_names | Optional | None | Names of control variables for printing |
| cov_checker | meteo_imp.gaussian.CheckPosDef \| None | None | Check covariance at every step |
| use_conditional | bool | False | Use conditional distribution for gaps that don't have all variables missing |
| use_control | bool | True | Use the control in the filter |
| use_smooth | bool | True | Use smoother for predictions (otherwise filter only) |
| pred_only_gap | bool | False | If True, predictions are only for the gap |
| pred_std | bool | False | Return only standard deviations and not covariances |
Kalman Filter
N dim obs: 3,
N dim state: 4,
N dim contr: 3
tensor([[[1.5725, 0.3829, 0.0843, 0.3637],
[0.3829, 1.4430, 0.3358, 1.1988],
[0.0843, 0.3358, 1.7816, 0.7411],
[0.3637, 1.1988, 0.7411, 1.6773]]], dtype=torch.float64,
grad_fn=<UnsafeViewBackward0>)
tensor([[[1.2540, 0.0000, 0.0000, 0.0000],
[0.3053, 1.1618, 0.0000, 0.0000],
[0.0672, 0.2714, 1.3052, 0.0000],
[0.2901, 0.9556, 0.3542, 0.7446]]], dtype=torch.float64,
grad_fn=<DiagonalScatterBackward0>)
check that assignment works :)
tensor([[0.9349, 0.0000, 0.0000, 0.0000],
[0.3928, 1.0748, 0.0000, 0.0000],
[0.7406, 0.7533, 0.8326, 0.0000],
[0.5903, 0.0391, 0.8217, 0.9638]], dtype=torch.float64,
grad_fn=<DiagonalScatterBackward0>)
Kalman Filter
N dim obs: 3,
N dim state: 4,
N dim contr: 3
[('A',
Parameter containing:
tensor([[[0.1751, 0.5375, 0.7676, 0.7450],
[0.1204, 0.4777, 0.5823, 0.3786],
[0.8484, 0.2317, 0.3969, 0.7088],
[0.0270, 0.6178, 0.6097, 0.9128]]], dtype=torch.float64,
requires_grad=True)),
('H',
Parameter containing:
tensor([[[0.4984, 0.1597, 0.4458, 0.7349],
[0.1070, 0.9531, 0.1942, 0.6683],
[0.9186, 0.7123, 0.1806, 0.5045]]], dtype=torch.float64,
requires_grad=True)),
('B',
Parameter containing:
tensor([[[0.2827, 0.1138, 0.9378],
[0.5594, 0.9364, 0.5136],
[0.8592, 0.7647, 0.5183],
[0.2376, 0.5618, 0.5096]]], dtype=torch.float64, requires_grad=True)),
('Q_raw',
Parameter containing:
tensor([[[0.4590, 0.0000, 0.0000, 0.0000],
[0.5348, 0.5296, 0.0000, 0.0000],
[0.0025, 0.0749, 0.4603, 0.0000],
[0.0588, 0.5416, 0.9309, 0.5859]]], dtype=torch.float64,
requires_grad=True)),
('R_raw',
Parameter containing:
tensor([[[0.1393, 0.0000, 0.0000],
[0.0249, 0.7613, 0.0000],
[0.7829, 0.4311, 0.9199]]], dtype=torch.float64, requires_grad=True)),
('b',
Parameter containing:
tensor([[[0.4661],
[0.3918],
[0.0571],
[0.9529]]], dtype=torch.float64, requires_grad=True)),
('d',
Parameter containing:
tensor([[[0.9334],
[0.5645],
[0.0695]]], dtype=torch.float64, requires_grad=True)),
('m0',
Parameter containing:
tensor([[[0.8234],
[0.3850],
[0.3380],
[0.4376]]], dtype=torch.float64, requires_grad=True)),
('P0_raw',
Parameter containing:
tensor([[[0.4467, 0.0000, 0.0000, 0.0000],
[0.4818, 0.9502, 0.0000, 0.0000],
[0.4770, 0.3478, 0.9812, 0.0000],
[0.2460, 0.9602, 0.6476, 0.2647]]], dtype=torch.float64,
requires_grad=True))]
OrderedDict([('A',
tensor([[[0.1751, 0.5375, 0.7676, 0.7450],
[0.1204, 0.4777, 0.5823, 0.3786],
[0.8484, 0.2317, 0.3969, 0.7088],
[0.0270, 0.6178, 0.6097, 0.9128]]], dtype=torch.float64)),
('H',
tensor([[[0.4984, 0.1597, 0.4458, 0.7349],
[0.1070, 0.9531, 0.1942, 0.6683],
[0.9186, 0.7123, 0.1806, 0.5045]]], dtype=torch.float64)),
('B',
tensor([[[0.2827, 0.1138, 0.9378],
[0.5594, 0.9364, 0.5136],
[0.8592, 0.7647, 0.5183],
[0.2376, 0.5618, 0.5096]]], dtype=torch.float64)),
('Q_raw',
tensor([[[0.4590, 0.0000, 0.0000, 0.0000],
[0.5348, 0.5296, 0.0000, 0.0000],
[0.0025, 0.0749, 0.4603, 0.0000],
[0.0588, 0.5416, 0.9309, 0.5859]]], dtype=torch.float64)),
('R_raw',
tensor([[[0.1393, 0.0000, 0.0000],
[0.0249, 0.7613, 0.0000],
[0.7829, 0.4311, 0.9199]]], dtype=torch.float64)),
('b',
tensor([[[0.4661],
[0.3918],
[0.0571],
[0.9529]]], dtype=torch.float64)),
('d',
tensor([[[0.9334],
[0.5645],
[0.0695]]], dtype=torch.float64)),
('m0',
tensor([[[0.8234],
[0.3850],
[0.3380],
[0.4376]]], dtype=torch.float64)),
('P0_raw',
tensor([[[0.4467, 0.0000, 0.0000, 0.0000],
[0.4818, 0.9502, 0.0000, 0.0000],
[0.4770, 0.3478, 0.9812, 0.0000],
[0.2460, 0.9602, 0.6476, 0.2647]]], dtype=torch.float64))])
Built-in mutable sequence.
If no argument is given, the constructor creates a new empty list. The argument must be an iterable if specified.
KalmanFilterBase.get_info ()
Kalman Filter (3 obs, 4 state, 3 contr)
$A$
state | x_0 | x_1 | x_2 | x_3 |
---|---|---|---|---|
x_0 | 0.1751 | 0.5375 | 0.7676 | 0.7450 |
x_1 | 0.1204 | 0.4777 | 0.5823 | 0.3786 |
x_2 | 0.8484 | 0.2317 | 0.3969 | 0.7088 |
x_3 | 0.0270 | 0.6178 | 0.6097 | 0.9128 |
$Q$
state | x_0 | x_1 | x_2 | x_3 |
---|---|---|---|---|
x_0 | 0.9002 | 0.5074 | 0.0024 | 0.0558 |
x_1 | 0.5074 | 1.2712 | 0.0756 | 0.5690 |
x_2 | 0.0024 | 0.0756 | 0.9072 | 0.9246 |
x_3 | 0.0558 | 0.5690 | 0.9246 | 2.2208 |
$b$
state | offset |
---|---|
x_0 | 0.4661 |
x_1 | 0.3918 |
x_2 | 0.0571 |
x_3 | 0.9529 |
$H$
variable | x_0 | x_1 | x_2 | x_3 |
---|---|---|---|---|
y_0 | 0.4984 | 0.1597 | 0.4458 | 0.7349 |
y_1 | 0.1070 | 0.9531 | 0.1942 | 0.6683 |
y_2 | 0.9186 | 0.7123 | 0.1806 | 0.5045 |
$R$
variable | y_0 | y_1 | y_2 |
---|---|---|---|
y_0 | 0.5856 | 0.0190 | 0.5991 |
y_1 | 0.0190 | 1.3107 | 0.5130 |
y_2 | 0.5991 | 0.5130 | 2.3748 |
$d$
variable | offset |
---|---|
y_0 | 0.9334 |
y_1 | 0.5645 |
y_2 | 0.0695 |
$B$
state | c_0 | c_1 | c_2 |
---|---|---|---|
x_0 | 0.2827 | 0.1138 | 0.9378 |
x_1 | 0.5594 | 0.9364 | 0.5136 |
x_2 | 0.8592 | 0.7647 | 0.5183 |
x_3 | 0.2376 | 0.5618 | 0.5096 |
$m_0$
state | mean |
---|---|
x_0 | 0.8234 |
x_1 | 0.3850 |
x_2 | 0.3380 |
x_3 | 0.4376 |
$P_0$
state | x_0 | x_1 | x_2 | x_3 |
---|---|---|---|---|
x_0 | 0.8860 | 0.4535 | 0.4490 | 0.2316 |
x_1 | 0.4535 | 1.8632 | 0.6740 | 1.3448 |
x_2 | 0.4490 | 0.6740 | 2.0374 | 1.2929 |
x_3 | 0.2316 | 1.3448 | 1.2929 | 2.0978 |
data
tensor([[[0.8775, nan, nan], [0.6706, nan, 0.9272], [ nan, 0.4967, nan], [ nan, nan, nan], [ nan, nan, 0.4760], [0.7606, 0.7759, 0.5243], [0.3714, 0.0426, 0.2343], [0.9991, 0.1775, nan], [0.6734, nan, 0.6468], [0.5825, 0.4599, 0.7960]], [[0.9038, 0.9735, 0.6428], [0.3725, 0.2052, nan], [0.4448, 0.5775, 0.7237], [0.5927, nan, 0.6441], [ nan, 0.9132, 0.0329], [0.4856, 0.9927, 0.5895], [0.2611, 0.9413, 0.1371], [0.8726, 0.5590, 0.8451], [0.1253, 0.9434, 0.0462], [0.2360, 0.0239, 0.8950]]], dtype=torch.float64)
mask
tensor([[[ True, False, False], [ True, False, True], [False, True, False], [False, False, False], [False, False, True], [ True, True, True], [ True, True, True], [ True, True, False], [ True, False, True], [ True, True, True]], [[ True, True, True], [ True, True, False], [ True, True, True], [ True, False, True], [False, True, True], [ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True]]])
control
tensor([[[0.9201, 0.6883, 0.5342], [0.7847, 0.3137, 0.1778], [0.5838, 0.9799, 0.3611], [0.3155, 0.7475, 0.5450], [0.5641, 0.2493, 0.8323], [0.9723, 0.1883, 0.3605], [0.5344, 0.3443, 0.7696], [0.3410, 0.7553, 0.3177], [0.0315, 0.5209, 0.6514], [0.3131, 0.4510, 0.3550]], [[0.4790, 0.0676, 0.3606], [0.7299, 0.6713, 0.3134], [0.7460, 0.1291, 0.4653], [0.5693, 0.9906, 0.8288], [0.9039, 0.5240, 0.6277], [0.3574, 0.0076, 0.6530], [0.8667, 0.9368, 0.8667], [0.6749, 0.3526, 0.6618], [0.0837, 0.7188, 0.7247], [0.3211, 0.4898, 0.9030]]], dtype=torch.float64)
Probability of the state at time t given the state at time t-1
\(p(x_t) = \mathcal{N}(x_t; m_t^-, P_t^-)\) where:
predicted state mean: \(m_t^- = Am_{t-1} + B c_t + b\)
predicted state covariance: \(P_t^- = AP_{t-1}A^T + Q\)
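A minimal sketch of this step (hypothetical helper name; `c` is the control at time t, shapes as in the outputs below):

import torch

def predict_step(A, Q, b, B, m, P, c):
    "Sketch: m_t^- = A m_{t-1} + B c_t + b,  P_t^- = A P_{t-1} A^T + Q"
    m_pred = A @ m + B @ c + b    # predicted state mean
    P_pred = A @ P @ A.mT + Q     # predicted state covariance
    return m_pred, P_pred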
(torch.Size([2, 4, 1]), torch.Size([2, 4, 4]), torch.Size([1, 4, 4]))
implement \(P_t^- = AP_{t-1}A^T + Q\)
tensor([[[6.2873, 3.7655, 4.8718, 5.8589],
[3.7655, 4.8551, 4.7314, 5.1009],
[4.8718, 4.7314, 5.9373, 6.0160],
[5.8589, 5.1009, 6.0160, 8.5954]],
[[6.2873, 3.7655, 4.8718, 5.8589],
[3.7655, 4.8551, 4.7314, 5.1009],
[4.8718, 4.7314, 5.9373, 6.0160],
[5.8589, 5.1009, 6.0160, 8.5954]]], dtype=torch.float64,
grad_fn=<AddBackward0>)
tensor([[0.7217],
[0.9572],
[0.5742],
[1.2138]], dtype=torch.float64, grad_fn=<MmBackward0>)
m_m
tensor([[[2.4986], [2.2429], [1.8833], [3.6164]], [[2.1377], [1.5964], [1.5371], [2.8315]]], dtype=torch.float64, grad_fn=)
P_m
tensor([[[6.2873, 3.7655, 4.8718, 5.8589], [3.7655, 4.8551, 4.7314, 5.1009], [4.8718, 4.7314, 5.9373, 6.0160], [5.8589, 5.1009, 6.0160, 8.5954]], [[6.2873, 3.7655, 4.8718, 5.8589], [3.7655, 4.8551, 4.7314, 5.1009], [4.8718, 4.7314, 5.9373, 6.0160], [5.8589, 5.1009, 6.0160, 8.5954]]], dtype=torch.float64, grad_fn=)
Probability of the state at time t given the observations at time t
\(p(x_t|y_t) = \mathcal{N}(x_t; m_t, P_t)\) where:
predicted obs mean: \(z_t = Hm_t^- + d\)
predicted obs covariance: \(S_t = HP_t^-H^T + R\)
Kalman gain: \(K_t = P_t^-H^TS_t^{-1}\)
corrected state mean: \(m_t = m_t^- + K_t(y_t - z_t)\)
corrected state covariance: \(P_t = (I-K_tH)P_t^-\)
If the observations are missing, this step is skipped and the corrected state is equal to the predicted state.
Need to figure out the Nans for the gradients …
Don’t compute the inverse of the matrix explicitly, but use `cholesky_solve` instead.
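A minimal sketch of the update step using `cholesky_solve` (hypothetical helper name; shapes as above):

import torch

def update_step(H, d, R, m_pred, P_pred, y):
    "Sketch of the correction step, avoiding an explicit matrix inverse."
    z = H @ m_pred + d                         # predicted observation mean
    S = H @ P_pred @ H.mT + R                  # innovation covariance
    L = torch.linalg.cholesky(S)
    # K = P^- H^T S^{-1}, computed via cholesky_solve instead of inverting S
    K = torch.cholesky_solve(H @ P_pred, L).mT
    m = m_pred + K @ (y - z)                   # corrected state mean
    I = torch.eye(P_pred.shape[-1], dtype=P_pred.dtype, device=P_pred.device)
    P = (I - K @ H) @ P_pred                   # corrected state covariance
    return m, P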
tensor([[[ 0.1177, 0.2313, 0.0612],
[-0.5573, 0.6582, 0.0288],
[-0.3189, 0.5919, -0.0174],
[ 0.3269, 0.3056, -0.0378]],
[[ 0.1177, 0.2313, 0.0612],
[-0.5573, 0.6582, 0.0288],
[-0.3189, 0.5919, -0.0174],
[ 0.3269, 0.3056, -0.0378]]], dtype=torch.float64,
grad_fn=<TransposeBackward0>)
tensor([[[ 2.1074, -0.2158, 0.3602, 0.1911],
[-0.2158, 0.4808, 0.0472, -0.1503],
[ 0.3602, 0.0472, 0.8035, -0.0177],
[ 0.1911, -0.1503, -0.0177, 0.8448]],
[[ 2.1074, -0.2158, 0.3602, 0.1911],
[-0.2158, 0.4808, 0.0472, -0.1503],
[ 0.3602, 0.0472, 0.8035, -0.0177],
[ 0.1911, -0.1503, -0.0177, 0.8448]]], dtype=torch.float64,
grad_fn=<UnsafeViewBackward0>)
tensor([[[-3.3343],
[ nan],
[ nan]],
[[-2.4181],
[-5.0754],
[-3.9145]]], dtype=torch.float64, grad_fn=<SubBackward0>)
tensor([[[ nan],
[ nan],
[ nan],
[ nan]],
[[ 0.4396],
[-0.5094],
[-0.6276],
[ 0.6379]]], dtype=torch.float64, grad_fn=<AddBackward0>)
m
tensor([[[ nan], [ nan], [ nan], [ nan]], [[ 0.4396], [-0.5094], [-0.6276], [ 0.6379]]], dtype=torch.float64, grad_fn=)
P
tensor([[[ 2.1074, -0.2158, 0.3602, 0.1911], [-0.2158, 0.4808, 0.0472, -0.1503], [ 0.3602, 0.0472, 0.8035, -0.0177], [ 0.1911, -0.1503, -0.0177, 0.8448]], [[ 2.1074, -0.2158, 0.3602, 0.1911], [-0.2158, 0.4808, 0.0472, -0.1503], [ 0.3602, 0.0472, 0.8035, -0.0177], [ 0.1911, -0.1503, -0.0177, 0.8448]]], dtype=torch.float64, grad_fn=)
(torch.Size([2, 4, 1]), torch.Size([2, 4, 4]))
There are `nan` in the output because there are `nan` in the observations.
The next function adds support for missing observations by also using a mask.
If all the observations at time \(t\) are missing, the correction step is skipped and the filtered state at time \(t\) is the same as the predicted state.
If only some observations are missing, a variation of the update equations can be used.
\(y^{ng}_t\) is a vector containing the observations that are not missing at time \(t\).
It can be expressed as a linear transformation of \(y_t\)
\[ y^{ng}_t = My_t\]
where \(M\) is a mask matrix that is used to select the subset of \(y_t\) that is observed. \(M \in \mathbb{R}^{n_{ng} \times n}\); each row is all zeros except for a 1 in the column corresponding to a non-missing observation. Hence:
\[ p(y^{ng}_t) = \mathcal{N}(M\mu_{y_t}, M\Sigma_{y_t}M^T)\]
from which you can derive
\[ p(y^{ng}_t|x_t) = \mathcal{N}(MHx_t + Md, MRM^T) \tag{1}\]
Then the posterior \(p(x_t|y_t^{ng})\) can be computed similarly to equation @filter_correct as:
\[ p(x_t|y^{ng}_t) = \mathcal{N}(x_t; m_t, P_t) \tag{2}\]
where \(m_t\) and \(P_t\) are obtained from the update equations above with \(H\), \(d\) and \(R\) replaced by \(MH\), \(Md\) and \(MRM^T\).
For the implementation, the matrix multiplication \(MH\) can be replaced with `H[m]`, where `m` is the boolean mask over the rows of `H`, and \(MRM^T\) with `R[m][:,m]`.
m = torch.tensor([False,True,True]) # mask batch
M = torch.tensor([[[0,1,0], # mask matrix
[0,0,1]]], dtype=torch.float64)
show_as_row(m, M, H, R)
m
tensor([False, True, True])
M
tensor([[[0., 1., 0.], [0., 0., 1.]]], dtype=torch.float64)
H
Parameter containing: tensor([[[0.1349, 0.0519, 0.1180, 0.9767], [0.1679, 0.8635, 0.3753, 0.9760], [0.2125, 0.8049, 0.2124, 0.6794]]], dtype=torch.float64, requires_grad=True)
R
tensor([[[1.6259, 1.1183, 0.8535], [1.1183, 1.2831, 0.9935], [0.8535, 0.9935, 2.4550]]], dtype=torch.float64, grad_fn=)
(tensor([[[0.1679, 0.8635, 0.3753, 0.9760],
[0.2125, 0.8049, 0.2124, 0.6794]]], dtype=torch.float64,
grad_fn=<UnsafeViewBackward0>),
tensor([[[0.1679, 0.8635, 0.3753, 0.9760],
[0.2125, 0.8049, 0.2124, 0.6794]]], dtype=torch.float64,
grad_fn=<IndexBackward0>))
(tensor([[[1.2831, 0.9935],
[0.9935, 2.4550]]], dtype=torch.float64,
grad_fn=<UnsafeViewBackward0>),
tensor([[[1.2831, 0.9935],
[0.9935, 2.4550]]], dtype=torch.float64, grad_fn=<IndexBackward0>))
When using partially missing observations, `_filter_update` cannot be easily batched, as the shape of the intermediate variables depends on the number of observed variables. So the idea is to divide the batch into blocks that share the same number of missing variables.
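One possible way to build such blocks is to group the batch items by their missing pattern (a sketch, reusing the `_filter_update_mask` helper from above; the library may group differently):

import torch

def update_by_blocks(H, d, R, m_pred, P_pred, obs_t, mask_t):
    "Sketch: split the batch into groups that share a mask pattern and update each group."
    m = torch.empty_like(m_pred)
    P = torch.empty_like(P_pred)
    patterns, inv = torch.unique(mask_t, dim=0, return_inverse=True)
    for i, pattern in enumerate(patterns):
        idx = (inv == i).nonzero(as_tuple=True)[0]     # batch items with this pattern
        m[idx], P[idx] = _filter_update_mask(
            H, d, R, m_pred[idx], P_pred[idx], obs_t[idx], pattern)
    return m, P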
(torch.Size([1, 1, 4]),
torch.Size([1, 1, 1]),
torch.Size([1, 1, 1]),
torch.Size([2, 1, 1]))
#0
tensor([[[0.7184], [0.7151], [0.0696], [1.1526]], [[0.8467], [0.4883], [0.2217], [1.0446]]], dtype=torch.float64, grad_fn=)
#1
tensor([[[2.3679, 0.4016, 0.8785, 0.4342], [0.4016, 1.9680, 1.3041, 0.4451], [0.8785, 1.3041, 1.8687, 0.4890], [0.4342, 0.4451, 0.4890, 1.0874]], [[2.3679, 0.4016, 0.8785, 0.4342], [0.4016, 1.9680, 1.3041, 0.4451], [0.8785, 1.3041, 1.8687, 0.4890], [0.4342, 0.4451, 0.4890, 1.0874]]], dtype=torch.float64, grad_fn=)
m, P = _filter_update_mask_batch(H, d, R, m_m, P_m, obs, mask[:,0,:] )
show_as_row(m, P)
m.shape, P.shape
m
tensor([[[ 0.7184], [ 0.7151], [ 0.0696], [ 1.1526]], [[ 0.4396], [-0.5094], [-0.6276], [ 0.6379]]], dtype=torch.float64, grad_fn=)
P
tensor([[[ 2.3679, 0.4016, 0.8785, 0.4342], [ 0.4016, 1.9680, 1.3041, 0.4451], [ 0.8785, 1.3041, 1.8687, 0.4890], [ 0.4342, 0.4451, 0.4890, 1.0874]], [[ 2.1074, -0.2158, 0.3602, 0.1911], [-0.2158, 0.4808, 0.0472, -0.1503], [ 0.3602, 0.0472, 0.8035, -0.0177], [ 0.1911, -0.1503, -0.0177, 0.8448]]], dtype=torch.float64, grad_fn=)
(torch.Size([2, 4, 1]), torch.Size([2, 4, 4]))
The recursive version of the Kalman filter is apparently breaking PyTorch gradient calculations, so a workaround is needed. During the loop the states are saved in a Python list and then at the end they are combined back into a tensor. The last line of the function does:
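Presumably something along these lines (a sketch with hypothetical list names `ms` and `Ps` for the collected per-timestep means and covariances):

# ms, Ps are Python lists of per-timestep tensors, shapes [n_batch, n_dim_state, 1] and [n_batch, n_dim_state, n_dim_state]
means = torch.stack(ms, dim=1)   # -> [n_batch, n_timesteps, n_dim_state, 1]
covs = torch.stack(Ps, dim=1)    # -> [n_batch, n_timesteps, n_dim_state, n_dim_state]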
Predictions at time 0 for both batches
#0
torch.Size([2, 10, 4, 1])
#1
torch.Size([2, 10, 4, 4])
#2
torch.Size([2, 10, 4, 1])
#3
torch.Size([2, 10, 4, 4])
#0
tensor([[0.6498], [0.3257], [0.8462], [0.7727]], dtype=torch.float64, grad_fn=)
#1
tensor([[0.7511, 0.6229, 0.6414, 0.4561], [0.6229, 1.2138, 0.6974, 0.9224], [0.6414, 0.6974, 1.4388, 1.4051], [0.4561, 0.9224, 1.4051, 3.1638]], dtype=torch.float64, grad_fn=)
#2
tensor([[0.6392], [0.3073], [0.8191], [0.7180]], dtype=torch.float64, grad_fn=)
#3
tensor([[0.6695, 0.4821, 0.4340, 0.0368], [0.4821, 0.9707, 0.3394, 0.1987], [0.4340, 0.3394, 0.9115, 0.3391], [0.0368, 0.1987, 0.3391, 1.0092]], dtype=torch.float64, grad_fn=)
The `filter` method wraps `_filter_all` but in addition:
KalmanFilter.filter (obs:torch.Tensor, mask:torch.Tensor, control:torch.Tensor)
Filter observation
| | Type | Details |
|---|---|---|
| obs | Tensor | ([n_batches], n_obs, [self.n_dim_obs]) where n_batches and n_dim_obs dimensions can be omitted if 1 |
| mask | Tensor | ([n_batches], n_obs, [self.n_dim_obs]) where n_batches and n_dim_obs dimensions can be omitted if 1 |
| control | Tensor | ([n_batches], n_obs, [self.n_dim_contr]) |
| Returns | ListMultiNormal | Filtered state (n_batches, n_obs, self.n_dim_state) |
Compute the probability of the state at time t given all the observations:
\(p(x_t|Y) = \mathcal{N}(x_t; m_t^s, P_t^s)\) where:
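These are the standard Rauch-Tung-Striebel recursions (which is what `_smooth_gain` and `_smooth_update` below appear to implement), running backwards from the last time step, where the smoothed state is initialized to the filtered state:
\[\begin{align} G_t & = P_t A^T (P_{t+1}^-)^{-1} \\ m_t^s & = m_t + G_t(m_{t+1}^s - m_{t+1}^-) \\ P_t^s & = P_t + G_t(P_{t+1}^s - P_{t+1}^-)G_t^T \end{align}\]
Here \(G_t\) is the smoother gain, \(m_t, P_t\) are the filtered quantities and \(m_{t+1}^-, P_{t+1}^-\) the predicted ones.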
tensor([[[ 0.6254],
[ 0.2922],
[ 0.7965],
[ 0.6964]],
[[-0.1829],
[-0.7088],
[-0.0211],
[ 0.2570]]], dtype=torch.float64, grad_fn=<AddBackward0>)
tensor([[[ 0.5327, 0.3321, 0.2094, -0.1773],
[ 0.3321, 0.8061, 0.0930, -0.0363],
[ 0.2094, 0.0930, 0.5427, -0.0125],
[-0.1773, -0.0363, -0.0125, 0.6738]],
[[ 0.3342, 0.0671, 0.0652, -0.2001],
[ 0.0671, 0.3478, -0.1115, -0.1245],
[ 0.0652, -0.1115, 0.4549, -0.0025],
[-0.2001, -0.1245, -0.0025, 0.6981]]], dtype=torch.float64,
grad_fn=<DivBackward0>)
#0
tensor([[[ 0.6254], [ 0.2922], [ 0.7965], [ 0.6964]], [[-0.1829], [-0.7088], [-0.0211], [ 0.2570]]], dtype=torch.float64, grad_fn=)
#1
tensor([[[ 0.5327, 0.3321, 0.2094, -0.1773], [ 0.3321, 0.8061, 0.0930, -0.0363], [ 0.2094, 0.0930, 0.5427, -0.0125], [-0.1773, -0.0363, -0.0125, 0.6738]], [[ 0.3342, 0.0671, 0.0652, -0.2001], [ 0.0671, 0.3478, -0.1115, -0.1245], [ 0.0652, -0.1115, 0.4549, -0.0025], [-0.2001, -0.1245, -0.0025, 0.6981]]], dtype=torch.float64, grad_fn=)
#0
tensor([[-0.0785], [-0.5839], [-0.0959], [ 0.0031]], dtype=torch.float64, grad_fn=)
#1
tensor([[ 0.4578, 0.2300, 0.1755, -0.1481], [ 0.2300, 0.6676, 0.0346, -0.0171], [ 0.1755, 0.0346, 0.5665, 0.0642], [-0.1481, -0.0171, 0.0642, 0.7652]], dtype=torch.float64, grad_fn=)
KalmanFilter.smooth (obs:torch.Tensor, mask:torch.Tensor, control:torch.Tensor)
Kalman Filter Smoothing
| | Type | Details |
|---|---|---|
| obs | Tensor | |
| mask | Tensor | |
| control | Tensor | |
| Returns | ListMultiNormal | [n_timesteps, n_dim_state] smoothed state |
#0
torch.Size([2, 10, 4, 1])
#1
torch.Size([2, 10, 4, 4])
The predictions at time t (\(y_t\)) are computed from the state (\(x_t\)) using this formula: \[p(y_t|x_t) = \mathcal{N}(Hx_t + d, R + HP^s_tH^T)\]
This works whether the state comes from the filter or the smoother.
This adds support for conditional predictions, which means that at the time \(t\) when we are making the predictions some of the variables have actually been observed. Since the model prediction is a normal distribution we can condition on the observed values and thus improve the predictions. See conditional_gaussian.
In order for conditional predictions to make sense it is not possible to return the full covariance matrix for the predictions, but only the standard deviations.
predict can be vectorized across both the batch and the timesteps, except for timesteps that require conditional predictions.
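A minimal sketch of the conditioning itself, for a single time step with an unbatched mean vector (hypothetical helper; the library's own conditional_gaussian may differ):

import torch

def condition_on_observed(mean, cov, y, obs_mask):
    "Sketch: condition N(mean, cov) on the observed entries and return the gap distribution."
    g, o = ~obs_mask, obs_mask                     # gap / observed indices
    cov_go = cov[g][:, o]                          # cross covariance between gap and observed
    sol = torch.linalg.solve(cov[o][:, o], y[o] - mean[o])
    mean_g = mean[g] + cov_go @ sol                # conditional mean of the gap variables
    cov_g = cov[g][:, g] - cov_go @ torch.linalg.solve(cov[o][:, o], cov_go.mT)
    return mean_g, cov_g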
(torch.Size([2, 10, 4, 1]), torch.Size([2, 10, 4, 4]))
KalmanFilter.predict (obs, mask, control, smooth=True)
Predicted observations at all times
Gradients …
tensor([[[ -2.5082, 0.0000, 0.0000],
[-11.7667, -0.1206, 0.0000],
[ -7.6345, -8.3390, -4.0905]]], dtype=torch.float64)
KalmanFilter.predict (obs, mask, control, smooth=True)
Predicted observations at all times
@patch
def predict_times(self: KalmanFilter, times, obs, mask=None, smooth=True, check_args=None):
"""Predicted observations at specific times """
state = self.smooth(obs, mask, check_args) if smooth else self.filter(obs, mask, check_args)
obs, mask = self._parse_obs(obs, mask)
times = array1d(times)
n_timesteps = obs.shape[0]
n_features = obs.shape[1] if len(obs.shape) > 1 else 1
if times.max() > n_timesteps or times.min() < 0:
raise ValueError(f"provided times range from {times.min()} to {times.max()}, which is outside allowed range : 0 to {n_timesteps}")
means = torch.empty((times.shape[0], n_features), dtype=obs.dtype, device=obs.device)
stds = torch.empty((times.shape[0], n_features), dtype=obs.dtype, device=obs.device)
for i, t in enumerate(times):
mean, std = self._obs_from_state(
state.mean[t],
state.cov[t],
{'t': t, **check_args} if check_args is not None else None
)
means[i], stds[i] = _get_cond_pred(ListNormal(mean, std), obs[t], mask[t])
return ListNormal(means, stds)
Implement the numerically stable version of the covariance update
tensor([[[1.9504, 2.3535, 2.1727, 2.2936],
[2.3535, 5.8436, 5.0464, 6.0831],
[2.1727, 5.0464, 6.2940, 6.0969],
[2.2936, 6.0831, 6.0969, 8.1538]],
[[1.9504, 2.3535, 2.1727, 2.2936],
[2.3535, 5.8436, 5.0464, 6.0831],
[2.1727, 5.0464, 6.2940, 6.0969],
[2.2936, 6.0831, 6.0969, 8.1538]]], dtype=torch.float64,
grad_fn=<AddBackward0>)
\[W = \begin{bmatrix}AC_{t-1}&C_Q\end{bmatrix}\]
tensor([[[-2.5709, 0.0000, 0.0000, 0.0000],
[-1.7904, 1.1590, 0.0000, 0.0000],
[-1.9907, 1.3402, 1.2995, 0.0000],
[-2.2985, 0.5614, 0.8125, -1.0728]],
[[-2.5709, 0.0000, 0.0000, 0.0000],
[-1.7904, 1.1590, 0.0000, 0.0000],
[-1.9907, 1.3402, 1.2995, 0.0000],
[-2.2985, 0.5614, 0.8125, -1.0728]]], dtype=torch.float64,
grad_fn=<TransposeBackward0>)
tensor([[[6.6096, 4.6031, 5.1178, 5.9093],
[4.6031, 4.5489, 5.1174, 4.7660],
[5.1178, 5.1174, 7.4476, 6.3838],
[5.9093, 4.7660, 6.3838, 7.4094]],
[[6.6096, 4.6031, 5.1178, 5.9093],
[4.6031, 4.5489, 5.1174, 4.7660],
[5.1178, 5.1174, 7.4476, 6.3838],
[5.9093, 4.7660, 6.3838, 7.4094]]], dtype=torch.float64,
grad_fn=<UnsafeViewBackward0>)
tensor(8.8818e-16, dtype=torch.float64, grad_fn=<MaxBackward1>)
The square-root factor is not unique (it can differ from the Cholesky factor by signs), but the solution is correct
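A minimal sketch of the square-root predict step based on the \(W\) matrix above (hypothetical helper name, assuming lower-triangular factors `P_C` of \(P_{t-1}\) and `Q_C` of \(Q\)); note that \(P_t^-\) itself is never formed:

import torch

def predict_cov_sr(A, Q_C, P_C):
    "Sketch: square root of P_t^- = A P_{t-1} A^T + Q via a QR decomposition of W^T."
    W = torch.cat([A @ P_C, Q_C], dim=-1)   # [.., n, 2n],  W W^T = A P A^T + Q
    _, R_f = torch.linalg.qr(W.mT)          # W^T = Q R  =>  W W^T = R^T R
    return R_f.mT                           # a (possibly sign-flipped) lower-triangular square root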
def fuzz_filter_predict_cov_SR(n=10):
for _ in range(n):
kSR = KalmanFilterSR.init_random(5,10,4)
A, Q_C, b, B, m_pr,P_pr = (kSR.A.unsqueeze(0), kSR.Q_C.unsqueeze(0), kSR.b.unsqueeze(-1),
kSR.B.unsqueeze(0),
torch.stack([kSR.m0]*2).unsqueeze(-1),
torch.stack([kSR.P0]*2))
P_pr_C = torch.linalg.cholesky(P_pr)
P_m_C = _filter_predict_cov_SR(A, Q_C, P_pr_C)
test_close(P_m_C @ P_m_C.mT, _filter_predict_cov_stand(A, Q_C @ Q_C.mT, P_pr), eps=5e-13)
m_m, P_m_C = _filter_predict_SR(kSR.A, kSR.Q_C, kSR.b, kSR.B, m_pr, P_pr_C, control[:,0].unsqueeze(-1))
show_as_row(m_m, P_m_C)
m_m
tensor([[[1.9549], [2.4811], [3.3275], [3.5545]], [[1.1647], [2.3109], [2.4892], [2.8133]]], dtype=torch.float64, grad_fn=)
P_m_C
tensor([[[-1.3966, 0.0000, 0.0000, 0.0000], [-1.6852, -1.7331, 0.0000, 0.0000], [-1.5557, -1.3991, 1.3843, 0.0000], [-1.6423, -1.9130, 0.6253, -1.1858]], [[-1.3966, 0.0000, 0.0000, 0.0000], [-1.6852, -1.7331, 0.0000, 0.0000], [-1.5557, -1.3991, 1.3843, 0.0000], [-1.6423, -1.9130, 0.6253, -1.1858]]], dtype=torch.float64, grad_fn=)
cat_2d (x)
| | Details |
|---|---|
| x | matrix as list of list of Tensor |
H, R, R_C, obs = kSR.H, kSR.R, kSR.R_C, data[:,0,:].unsqueeze(-1)
P_m = P_m_C @ P_m_C.mT
# use standard filter to compute expected result
S = H @ P_m @ H.mT + R
S_C = torch.linalg.cholesky(S)
K = _filter_update_k_gain(H, R, P_m)
K_bar = K @ S_C
P_stand = _filter_update_cov(H, K, P_m)
P_C_stand = torch.linalg.cholesky(P_stand)
\[M = \begin{bmatrix} R^{1/2} & H(P^-)^{1/2} \\ 0 & (P^-)^{1/2} \end{bmatrix}\]
tensor([[ 1.2478, 0.0000, 0.0000, -1.2662, -1.1753, 0.7242, -0.3448],
[ 0.3190, 1.2801, 0.0000, -2.1349, -1.5926, 1.0952, -0.3843],
[ 0.0341, 0.9601, 0.7766, -3.2461, -2.1290, 0.8250, -0.5179],
[ 0.0000, 0.0000, 0.0000, -1.3966, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, -1.6852, -1.7331, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, -1.5557, -1.3991, 1.3843, 0.0000],
[ 0.0000, 0.0000, 0.0000, -1.6423, -1.9130, 0.6253, -1.1858]],
dtype=torch.float64, grad_fn=<SelectBackward0>)
\[V = \begin{bmatrix} S^{1/2} & 0 \\ \bar{K} & P^{1/2} \end{bmatrix}\]
tensor([[ 2.2770, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 2.5904, 1.8631, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 3.2634, 2.2593, 1.3379, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.7766, 0.5205, 0.6151, 0.8355, 0.0000, 0.0000, 0.0000],
[ 1.8316, 0.8659, 0.9168, -0.1000, 0.9427, 0.0000, 0.0000],
[ 2.0275, 0.9734, 0.2654, -0.0859, 0.2527, 1.0461, 0.0000],
[ 2.2790, 0.9606, 0.6923, -0.4813, 0.4183, 0.2013, 1.0540]],
dtype=torch.float64, grad_fn=<SelectBackward0>)
tensor([[-2.2770, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[-2.5904, -1.8631, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[-3.2634, -2.2593, -1.3379, 0.0000, 0.0000, 0.0000, 0.0000],
[-0.7766, -0.5205, -0.6151, 0.8355, 0.0000, 0.0000, 0.0000],
[-1.8316, -0.8659, -0.9168, -0.1000, 0.9427, 0.0000, 0.0000],
[-2.0275, -0.9734, -0.2654, -0.0859, 0.2527, -1.0461, 0.0000],
[-2.2790, -0.9606, -0.6923, -0.4813, 0.4183, -0.2013, -1.0540]],
dtype=torch.float64, grad_fn=<SelectBackward0>)
tensor([[[ 0.8355, 0.0000, 0.0000, 0.0000],
[-0.1000, 0.9427, 0.0000, 0.0000],
[-0.0859, 0.2527, -1.0461, 0.0000],
[-0.4813, 0.4183, -0.2013, -1.0540]],
[[ 0.8355, 0.0000, 0.0000, 0.0000],
[-0.1000, 0.9427, 0.0000, 0.0000],
[-0.0859, 0.2527, -1.0461, 0.0000],
[-0.4813, 0.4183, -0.2013, -1.0540]]], dtype=torch.float64,
grad_fn=<SliceBackward0>)
\(P_C\) computed with the QR decomposition is not the same as the one from the Cholesky decomposition, but that's okay! They are both valid square roots of the posterior covariance
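Putting the pieces together, a sketch of reading the blocks of \(V\) off the QR decomposition of \(M^T\) (hypothetical helper; block layout as in the arrays shown above):

import torch

def update_cov_sr(H, R_C, P_C):
    "Sketch: build M, take QR of M^T, and read S^{1/2}, K_bar and the posterior square root off V."
    n_obs, n_state = H.shape[-2], H.shape[-1]
    zeros = torch.zeros(*P_C.shape[:-2], n_state, n_obs, dtype=P_C.dtype, device=P_C.device)
    M = torch.cat([torch.cat([R_C, H @ P_C], dim=-1),
                   torch.cat([zeros, P_C], dim=-1)], dim=-2)
    _, T = torch.linalg.qr(M.mT)                 # M^T = Q T  =>  M M^T = T^T T
    V = T.mT                                     # lower block-triangular
    S_C = V[..., :n_obs, :n_obs]                 # square root of the innovation covariance S
    K_bar = V[..., n_obs:, :n_obs]               # K @ S^{1/2}
    P_C_new = V[..., n_obs:, n_obs:]             # square root of the posterior covariance
    return P_C_new, S_C, K_bar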
tensor_info (x)
def fuzz_filter_update_cov_SR(n=10):
errs = []
for _ in range(n):
kSR = KalmanFilterSR.init_random(5,10,4)
H, R_C, P_m = (kSR.H, kSR.R_C, torch.cat([kSR.P0]*5))
R = R_C @ R_C.mT
P_m_C = torch.linalg.cholesky(P_m)
P_C, _ = _filter_update_cov_SR(H, R_C, P_m_C)
K = _filter_update_k_gain(H, R, P_m)
P_stand = _filter_update_cov(H, K, P_m)
errs.append((P_C @ P_C.mT - P_stand).abs().max().item())
return torch.tensor(errs)
(tensor(2.8588e-15), tensor(1.0214e-14))
Don’t compute the inverse of the matrix explicitly, but use `cholesky_solve` instead.
tensor([[[-0.0014, -0.2782, 0.4598],
[ 0.2389, -0.3662, 0.6852],
[ 0.2854, 0.2819, 0.1984],
[ 0.3866, -0.1119, 0.5174]],
[[-0.0014, -0.2782, 0.4598],
[ 0.2389, -0.3662, 0.6852],
[ 0.2854, 0.2819, 0.1984],
[ 0.3866, -0.1119, 0.5174]]], dtype=torch.float64,
grad_fn=<TransposeBackward0>)
def fuzz_kalman_gain_SR(n=10):
errs = []
for _ in range(n):
kSR = KalmanFilterSR.init_random(5,10,4)
H, R_C, P_m = (kSR.H, kSR.R_C, torch.cat([kSR.P0]*5))
R = R_C @ R_C.mT
P_m_C = torch.linalg.cholesky(P_m)
P_C, S_C = _filter_update_cov_SR(H, R_C, P_m_C)
K_stand = _filter_update_k_gain(H, R, P_m)
K = _filter_update_k_gain_SR(H, P_m_C, S_C)
errs.append((K_stand - K).abs().max().item())
return torch.tensor(errs)
m
tensor([[[ nan], [ nan], [ nan], [ nan]], [[0.1197], [0.3457], [0.5033], [0.6044]]], dtype=torch.float64, grad_fn=)
P_C
tensor([[[ 0.8355, 0.0000, 0.0000, 0.0000], [-0.1000, 0.9427, 0.0000, 0.0000], [-0.0859, 0.2527, -1.0461, 0.0000], [-0.4813, 0.4183, -0.2013, -1.0540]], [[ 0.8355, 0.0000, 0.0000, 0.0000], [-0.1000, 0.9427, 0.0000, 0.0000], [-0.0859, 0.2527, -1.0461, 0.0000], [-0.4813, 0.4183, -0.2013, -1.0540]]], dtype=torch.float64, grad_fn=)
(torch.Size([2, 4, 1]), torch.Size([2, 4, 4]))
def fuzz_filter_update_SR(n=10):
errs = {'mean': [], 'cov': []}
for _ in range(n):
kSR = KalmanFilterSR.init_random(5,10,4)
H, d, R, R_C, m_m, P_m = (kSR.H, kSR.d, kSR.R, kSR.R_C, torch.cat([kSR.m0]*5), torch.cat([kSR.P0]*5))
obs = torch.randn_like(H @ m_m)
P_m_C = torch.linalg.cholesky(P_m)
mSR, P_m_C = _filter_update_SR(H, d, R_C, m_m, P_m_C, obs)
m, P_C = _filter_update(H, d, R, m_m, P_m, obs)
errs['mean'].append((mSR - m).abs().max().item())
errs['cov'].append((P_C - P_m_C @ P_m_C.mT).abs().max().item())
return pd.DataFrame(errs)
err = fuzz_filter_update_SR(100)
err.median(), err.max()
(mean 1.049161e-14
cov 2.740863e-15
dtype: float64,
mean 5.573320e-14
cov 1.099121e-14
dtype: float64)
Here we need to compute the square root of \(R\) after masking, because we cannot apply the mask directly to \(R^{1/2}\).
tensor([[[1.5571, 0.3980, 0.0426],
[0.3980, 1.7403, 1.2398],
[0.0426, 1.2398, 1.5260]]], dtype=torch.float64,
grad_fn=<UnsafeViewBackward0>)
_filter_update_SR(H_m, d_m, R_C_m, m_m, P_m_C, obs_m)[0] - _filter_update(H_m, d_m, R_m, m_m, P_m_C @ P_m_C.mT, obs_m)[0]
tensor([[[ nan],
[ nan],
[ nan],
[ nan]],
[[-0.1152],
[-0.1834],
[-0.1472],
[-0.1785]]], dtype=torch.float64, grad_fn=<SubBackward0>)
_filter_update_SR(H_m, d_m, R2_C_m, m_m, P_m_C, obs_m)[0] - _filter_update(H_m, d_m, R_m, m_m, P_m_C @ P_m_C.mT, obs_m)[0]
tensor([[[ nan],
[ nan],
[ nan],
[ nan]],
[[-4.4409e-16],
[-4.4409e-16],
[ 0.0000e+00],
[-8.8818e-16]]], dtype=torch.float64, grad_fn=<SubBackward0>)
#0
tensor([[[1.3313], [1.0104], [1.6996], [1.7246]], [[0.7732], [1.3875], [1.4670], [1.6642]]], dtype=torch.float64, grad_fn=)
#1
tensor([[[ 1.1441, 0.0000, 0.0000, 0.0000], [ 0.7349, 1.3175, 0.0000, 0.0000], [ 0.4355, 0.5900, -1.1768, 0.0000], [ 0.3596, 1.0472, -0.3473, -1.1330]], [[ 1.1441, 0.0000, 0.0000, 0.0000], [ 0.7349, 1.3175, 0.0000, 0.0000], [ 0.4355, 0.5900, -1.1768, 0.0000], [ 0.3596, 1.0472, -0.3473, -1.1330]]], dtype=torch.float64, grad_fn=)
(torch.Size([2, 4, 1]), torch.Size([2, 4, 4]))
def fuzz_filter_update_SR(n=10):
errs = {'mean': [], 'cov': []}
for _ in range(n):
kSR = KalmanFilterSR.init_random(5,10,4)
H, d, R, R_C, m_m, P_m = (kSR.H, kSR.d, kSR.R, kSR.R_C, torch.cat([kSR.m0]*5), torch.cat([kSR.P0]*5))
obs, mask, _ = get_test_data(1, 5, 4, bs=5)
obs, mask = obs[:,0].unsqueeze(-1), mask[0,0]
P_m_C = torch.linalg.cholesky(P_m)
mSR, P_m_C = _filter_update_mask_SR(H, d, R, m_m, P_m_C, obs, mask)
m, P_C = _filter_update_mask(H, d, R, m_m, P_m, obs, mask)
errs['mean'].append((mSR - m).abs().max().item())
errs['cov'].append((P_C - P_m_C @ P_m_C.mT).abs().max().item())
return pd.DataFrame(errs)
err = fuzz_filter_update_SR(100)
err.median(), err.max()
(mean 3.996803e-15
cov 1.970646e-15
dtype: float64,
mean 3.996803e-15
cov 7.827072e-15
dtype: float64)
m, P_C = _filter_update_mask_batch_SR(H, d, R, m_m, P_m_C, obs, mask[:,0,:] )
show_as_row(m, P_C)
m.shape, P_C.shape
m
tensor([[[1.3685], [1.0982], [1.7967], [1.8337]], [[0.1197], [0.3457], [0.5033], [0.6044]]], dtype=torch.float64, grad_fn=)
P_C
tensor([[[ 1.1607, 0.0000, 0.0000, 0.0000], [ 0.8022, 1.3585, 0.0000, 0.0000], [ 0.5153, 0.6769, -1.2081, 0.0000], [ 0.4512, 1.1387, -0.3915, -1.1430]], [[ 0.8355, 0.0000, 0.0000, 0.0000], [-0.1000, 0.9427, 0.0000, 0.0000], [-0.0859, 0.2527, -1.0461, 0.0000], [-0.4813, 0.4183, -0.2013, -1.0540]]], dtype=torch.float64, grad_fn=)
(torch.Size([2, 4, 1]), torch.Size([2, 4, 4]))
The recursive version of the Kalman filter is apparently breaking PyTorch gradient calculations, so a workaround is needed. During the loop the states are saved in a Python list and then at the end they are combined back into a tensor.
Predictions at time 0 for both batches
#0
torch.Size([2, 10, 4, 1])
#1
torch.Size([2, 10, 4, 4])
#2
torch.Size([2, 10, 4, 1])
#3
torch.Size([2, 10, 4, 4])
#0
tensor([[0.4499], [0.8575], [0.2647], [0.4293]], dtype=torch.float64, grad_fn=)
#1
tensor([[1.2562, 0.0000, 0.0000, 0.0000], [0.3804, 0.9666, 0.0000, 0.0000], [0.9536, 0.7909, 1.2431, 0.0000], [0.4487, 0.4993, 0.4600, 1.2599]], dtype=torch.float64, grad_fn=)
#2
tensor([[0.5189], [0.9208], [0.4203], [0.5421]], dtype=torch.float64, grad_fn=)
#3
tensor([[-1.1638, 0.0000, 0.0000, 0.0000], [-0.2344, -0.9142, 0.0000, 0.0000], [-0.5963, -0.5742, -1.1218, 0.0000], [-0.1702, -0.3039, -0.2622, 1.2088]], dtype=torch.float64, grad_fn=)
The `filter` method wraps `_filter_all` but in addition:
KalmanFilterSR.filter (obs:torch.Tensor, mask:torch.Tensor, control:torch.Tensor)
Filter observation
| | Type | Details |
|---|---|---|
| obs | Tensor | ([n_batches], n_obs, [self.n_dim_obs]) where n_batches and n_dim_obs dimensions can be omitted if 1 |
| mask | Tensor | ([n_batches], n_obs, [self.n_dim_obs]) where n_batches and n_dim_obs dimensions can be omitted if 1 |
| control | Tensor | ([n_batches], n_obs, [self.n_dim_contr]) |
| Returns | ListMultiNormal | Filtered state (n_batches, n_obs, self.n_dim_state) |
(torch.Size([2, 10, 4, 1]), torch.Size([2, 10, 4, 4]))
def fuzz_filter_SR(n=10):
errs = {'mean': [], 'cov': []}
for _ in range(n):
kSR = KalmanFilterSR.init_random(5,10,4)
k = KalmanFilter.init_from(kSR)
dat = get_test_data(20, 5, 4)
mean, cov = k.filter(*dat)
meanSR, covSR = kSR.filter(*dat)
covSR = covSR @ covSR.mT
errs['mean'].append((meanSR - mean).abs().max().item())
errs['cov'].append((covSR - cov).abs().max().item())
return pd.DataFrame(errs)
err = fuzz_filter_SR(10)
err.median(), err.max()
(mean 8.108847e-12
cov 8.149592e-12
dtype: float64,
mean 4.446576e-11
cov 1.261309e-10
dtype: float64)
Compute the probability of the state at time t given all the observations:
\(p(x_t|Y) = \mathcal{N}(x_t; m_t^s, P_t^s)\) where:
def fuzz_gain_SR(n=10):
errs = {'K': []}
for _ in range(n):
kSR = KalmanFilterSR.init_random(5,10,4)
k = KalmanFilter.init_from(kSR)
dat = get_test_data(2, 5, 4)
(_, f_cov), (_, p_cov) = k._filter_all(*dat)
(_, f_covSR), (_, p_covSR) = kSR._filter_all(*dat)
K = _smooth_gain(k.A, f_cov[:, 0], p_cov[:, 0])
K_SR = _smooth_gain_SR(k.A, f_covSR[:, 0], p_covSR[:, 0])
errs['K'].append((K - K_SR).abs().max().item())
return pd.DataFrame(errs)
err = fuzz_gain_SR(10)
err.median(), err.max()
(K 1.920686e-14
dtype: float64,
K 4.352074e-14
dtype: float64)
MultiNormal(mean=tensor([[[0.6325],
[0.9899],
[0.5680],
[0.6284]]], dtype=torch.float64, grad_fn=<AddBackward0>), cov=tensor([[[ 0.7480, -0.0959, -0.0944, -0.2622],
[-0.0959, 0.6667, 0.1856, 0.0380],
[-0.0944, 0.1856, 0.9192, -0.0280],
[-0.2622, 0.0380, -0.0280, 1.3021]]], dtype=torch.float64,
grad_fn=<DivBackward0>))
MultiNormal(mean=tensor([[[0.6325],
[0.9899],
[0.5680],
[0.6284]]], dtype=torch.float64, grad_fn=<AddBackward0>), cov=tensor([[[-1.8502, -2.4971, -5.2813, -3.0804],
[-2.4971, -1.6908, -4.8194, -2.8887],
[-5.2813, -4.8194, -9.7933, -6.0797],
[-3.0804, -2.8887, -6.0797, -2.6434]]], dtype=torch.float64,
grad_fn=<DivBackward0>))
def fuzz_update_SR(n=10):
errs = {'mean': [], 'cov': []}
for _ in range(n):
kSR = KalmanFilterSR.init_random(5,10,4)
k = KalmanFilter.init_from(kSR)
dat = get_test_data(2, 5, 4)
f_state, p_state = k._filter_all(*dat)
f_stateSR, p_stateSR = kSR._filter_all(*dat)
mean, cov = _smooth_update(k.A, f_state[:, 0], p_state[:, 0], f_state[:, 1])
meanSR, covSR = _smooth_update_SR(k.A, f_stateSR[:, 0], p_stateSR[:, 0], f_state[:, 1])
errs['mean'].append((meanSR - mean).abs().max().item())
errs['cov'].append((covSR - cov).abs().max().item())
return pd.DataFrame(errs)
err = fuzz_update_SR(10)
err.median(), err.max()
(mean 1.154632e-14
cov 2.771117e-13
dtype: float64,
mean 3.375078e-14
cov 8.242296e-13
dtype: float64)
s_mean, s_cov = _smooth(kSR.A, filt_state, pred_state)
s_meanSR, s_covSR = _smooth_SR(kSR.A, filt_stateSR, pred_stateSR)
test_close(s_cov, s_covSR)
(s_cov - s_covSR).median(), (s_cov - s_covSR).max()
(tensor(0., dtype=torch.float64, grad_fn=<MedianBackward0>),
tensor(3.5527e-15, dtype=torch.float64, grad_fn=<MaxBackward1>))
#0
tensor([[-0.3279], [ 0.4029], [-0.7501], [-0.0098]], dtype=torch.float64, grad_fn=)
#1
tensor([[ 0.9163, -0.0046, 0.0978, -0.1494], [-0.0046, 0.7132, 0.2841, 0.0932], [ 0.0978, 0.2841, 1.1262, 0.0916], [-0.1494, 0.0932, 0.0916, 1.3623]], dtype=torch.float64, grad_fn=)
tensor([[[ True, False, False],
[ True, False, True],
[False, True, False],
[False, False, False],
[False, False, True],
[ True, True, True],
[ True, True, True],
[ True, True, False],
[ True, False, True],
[ True, True, True]],
[[ True, True, True],
[ True, True, False],
[ True, True, True],
[ True, False, True],
[False, True, True],
[ True, True, True],
[ True, True, True],
[ True, True, True],
[ True, True, True],
[ True, True, True]]])
KalmanFilterSR.smooth (obs:torch.Tensor, mask:torch.Tensor, control:torch.Tensor)
Kalman Filter Smoothing
| | Type | Details |
|---|---|---|
| obs | Tensor | |
| mask | Tensor | |
| control | Tensor | |
| Returns | ListMultiNormal | [n_timesteps, n_dim_state] smoothed state |
#0
torch.Size([2, 10, 4, 1])
#1
torch.Size([2, 10, 4, 4])
tensor([[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]]], dtype=torch.float64)
(smoothed_state.mean - smoothed_state_stand.mean).max()
(smoothed_state.cov - smoothed_state_stand.cov).max()
tensor(0.6755, dtype=torch.float64, grad_fn=<MaxBackward1>)
def fuzz_smooth_SR(n=10):
errs = {'mean': [], 'cov': []}
for _ in range(n):
kSR = KalmanFilterSR.init_random(5,10,4)
k = KalmanFilter.init_from(kSR)
dat = get_test_data(20, 5, 4)
mean, cov = k.smooth(*dat)
meanSR, covSR = kSR.smooth(*dat)
errs['mean'].append((meanSR - mean).abs().max().item())
errs['cov'].append((covSR - cov).abs().max().item())
return pd.DataFrame(errs)
err = fuzz_smooth_SR(10)
err.median(), err.max()
(mean 9.733464e-12
cov 8.838374e-12
dtype: float64,
mean 4.984400e-09
cov 4.896766e-09
dtype: float64)
predict can be vectorized across both the batch and the timesteps, except for timesteps that require conditional predictions
(torch.Size([2, 10, 4, 1]), torch.Size([2, 10, 4, 4]))
(torch.Size([2, 10, 3, 1]), torch.Size([2, 10, 3, 3]))
Predict has various modes:
- if `pred_only_gap` is True, returns predictions only where the mask is False
- if `use_conditional` is True, returns a list (for each batch) of lists (for each time stamp) of Tensors of shape [1, gap_len]
- if `use_conditional` is False, returns a list (for each batch) of Tensors of shape [n_times_gap, n_dim_obs]

[tensor([False, False]),
tensor([False]),
tensor([False, False]),
tensor([False, False, False]),
tensor([False, False]),
tensor([False]),
tensor([False])]
'[tensor([False, False]), tensor([False]), tensor([False, False]), tensor([False, False, False]), tensor([False, False]), tensor([False]), tensor([False])]'
all_mask
tensor([[ True, False, False], [ True, False, True], [False, True, False], [False, False, False], [False, False, True], [ True, True, True], [ True, True, True], [ True, True, False], [ True, False, True], [ True, True, True]])
only_gap
[tensor([False, False]), tensor([False]), tensor([False, False]), tensor([False, False, False]), tensor([False, False]), tensor([False]), tensor([False])]
KalmanFilterSR.predict (obs, mask, control, smooth=True)
Predicted observations at all times
with_settings (k, **kwargs)
replacing_ctx (*args)
with with_settings(kSR, use_conditional=False, pred_only_gap=False):
pred_mean, pred_cov = kSR.predict(data, mask, control)
show_as_row(mean= pred_mean.shape, cov = pred_cov.shape)
mean
torch.Size([2, 10, 3])
cov
torch.Size([2, 10, 3, 3])
with with_settings(kSR, use_conditional=False, pred_only_gap=True):
pred_mean, pred_cov = kSR.predict(data, mask, control)
show_as_row(mean= pred_mean[0], cov = pred_cov[0])
mean
[tensor([-0.3666, 0.3765], dtype=torch.float64, grad_fn=), tensor([-0.0844], dtype=torch.float64, grad_fn= ), tensor([0.1625, 0.3119], dtype=torch.float64, grad_fn= ), tensor([-0.1102, -0.5474, -0.1901], dtype=torch.float64, grad_fn= ), tensor([-0.1439, -0.4996], dtype=torch.float64, grad_fn= ), tensor([0.4966], dtype=torch.float64, grad_fn= ), tensor([0.0202], dtype=torch.float64, grad_fn= )]
cov
[tensor([[2.5518, 2.1682], [2.1682, 2.8766]], dtype=torch.float64, grad_fn=), tensor([[2.3344]], dtype=torch.float64, grad_fn= ), tensor([[1.9095, 0.5371], [0.5371, 3.0221]], dtype=torch.float64, grad_fn= ), tensor([[2.0929, 1.1465, 0.8395], [1.1465, 2.8696, 2.5367], [0.8395, 2.5367, 3.5092]], dtype=torch.float64, grad_fn= ), tensor([[1.9076, 0.8393], [0.8393, 2.3729]], dtype=torch.float64, grad_fn= ), tensor([[2.7991]], dtype=torch.float64, grad_fn= ), tensor([[2.2841]], dtype=torch.float64, grad_fn= )]
with with_settings(kSR, use_conditional=True, pred_only_gap=True):
pred_mean, pred_cov = kSR.predict(data, mask, control)
show_as_row(mean= pred_mean[0], cov = pred_cov[0])
mean
[tensor([0.0292, 0.6235], dtype=torch.float64, grad_fn=), tensor([0.4156], dtype=torch.float64, grad_fn= ), tensor([0.3672, 0.7999], dtype=torch.float64, grad_fn= ), tensor([-0.1102, -0.5474, -0.1901], dtype=torch.float64, grad_fn= ), tensor([-0.0686, -0.1575], dtype=torch.float64, grad_fn= ), tensor([0.4823], dtype=torch.float64, grad_fn= ), tensor([0.1720], dtype=torch.float64, grad_fn= )]
cov
[tensor([[2.5518, 2.1682], [2.1682, 2.8766]], dtype=torch.float64, grad_fn=), tensor([[2.3344]], dtype=torch.float64, grad_fn= ), tensor([[1.9095, 0.5371], [0.5371, 3.0221]], dtype=torch.float64, grad_fn= ), tensor([[2.0929, 1.1465, 0.8395], [1.1465, 2.8696, 2.5367], [0.8395, 2.5367, 3.5092]], dtype=torch.float64, grad_fn= ), tensor([[1.9076, 0.8393], [0.8393, 2.3729]], dtype=torch.float64, grad_fn= ), tensor([[2.7991]], dtype=torch.float64, grad_fn= ), tensor([[2.2841]], dtype=torch.float64, grad_fn= )]
Copy-pasted from another notebook to make visualization easier
buffer_pred_single (preds:list[torch.Tensor], masks:torch.Tensor)
When predictions are for gaps only, add a buffer of NaN so they have the same shape as the targets
buffer_pred (preds:list[list[torch.Tensor]], masks:torch.Tensor)
When predictions are for gaps only, add a buffer of NaN so they have the same shape as the targets
soooooooo this is a problem!!! those should be the same
gap
tensor([[[ nan, -0.3666, 0.3765], [ nan, -0.0844, nan], [ 0.1625, nan, 0.3119], [-0.1102, -0.5474, -0.1901], [-0.1439, -0.4996, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, 0.4966], [ nan, 0.0202, nan], [ nan, nan, nan]], [[ nan, nan, nan], [ nan, nan, 0.2059], [ nan, nan, nan], [ nan, -0.3611, nan], [ 0.1032, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan]]], dtype=torch.float64)
no_gap
tensor([[[ 0.0322, -0.3666, 0.3765], [ 0.1475, -0.0844, 0.4320], [ 0.1625, -0.0796, 0.3119], [-0.1102, -0.5474, -0.1901], [-0.1439, -0.4996, 0.0158], [ 0.1736, -0.0500, 0.1638], [ 0.0998, -0.2115, 0.2026], [ 0.2642, 0.0727, 0.4966], [ 0.2667, 0.0202, 0.5999], [ 0.5621, 0.4854, 1.0911]], [[ 0.0523, -0.3558, 0.1773], [ 0.0572, -0.2074, 0.2059], [ 0.0361, -0.2809, 0.0575], [-0.0671, -0.3611, 0.1202], [ 0.1032, -0.2193, -0.1928], [-0.1298, -0.5874, -0.3787], [ 0.0450, -0.2554, -0.2029], [ 0.0938, -0.0932, 0.3782], [ 0.2229, -0.1440, -0.1133], [ 0.3722, 0.3017, 1.1004]]], dtype=torch.float64, grad_fn=)
tensor([[[ 0.0322, -0.3666, 0.3765],
[ 0.1475, -0.0844, 0.4320],
[ 0.1625, -0.0796, 0.3119],
[-0.1102, -0.5474, -0.1901],
[-0.1439, -0.4996, 0.0158],
[ 0.1736, -0.0500, 0.1638],
[ 0.0998, -0.2115, 0.2026],
[ 0.2642, 0.0727, 0.4966],
[ 0.2667, 0.0202, 0.5999],
[ 0.5621, 0.4854, 1.0911]],
[[ 0.0523, -0.3558, 0.1773],
[ 0.0572, -0.2074, 0.2059],
[ 0.0361, -0.2809, 0.0575],
[-0.0671, -0.3611, 0.1202],
[ 0.1032, -0.2193, -0.1928],
[-0.1298, -0.5874, -0.3787],
[ 0.0450, -0.2554, -0.2029],
[ 0.0938, -0.0932, 0.3782],
[ 0.2229, -0.1440, -0.1133],
[ 0.3722, 0.3017, 1.1004]]], dtype=torch.float64,
grad_fn=<SqueezeBackward3>)
tensor([[ 0.0322, -0.3666, 0.3765],
[ 0.1475, -0.0844, 0.4320],
[ 0.1625, -0.0796, 0.3119],
[-0.1102, -0.5474, -0.1901],
[-0.1439, -0.4996, 0.0158],
[ 0.2642, 0.0727, 0.4966],
[ 0.2667, 0.0202, 0.5999],
[ 0.0572, -0.2074, 0.2059],
[-0.0671, -0.3611, 0.1202],
[ 0.1032, -0.2193, -0.1928]], dtype=torch.float64,
grad_fn=<SqueezeBackward3>)
show_as_row(pred = buffer_pred(_masked2batch(pred_obs_gap.mean, mask), mask), mask = mask, all = pred_obs.mean)
pred
tensor([[[ nan, -0.3666, 0.3765], [ nan, -0.0844, nan], [ 0.1625, nan, 0.3119], [-0.1102, -0.5474, -0.1901], [-0.1439, -0.4996, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, 0.4966], [ nan, 0.0202, nan], [ nan, nan, nan]], [[ nan, nan, nan], [ nan, nan, 0.2059], [ nan, nan, nan], [ nan, -0.3611, nan], [ 0.1032, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan]]], dtype=torch.float64)
mask
tensor([[[ True, False, False], [ True, False, True], [False, True, False], [False, False, False], [False, False, True], [ True, True, True], [ True, True, True], [ True, True, False], [ True, False, True], [ True, True, True]], [[ True, True, True], [ True, True, False], [ True, True, True], [ True, False, True], [False, True, True], [ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True]]])
all
tensor([[[ 0.0322, -0.3666, 0.3765], [ 0.1475, -0.0844, 0.4320], [ 0.1625, -0.0796, 0.3119], [-0.1102, -0.5474, -0.1901], [-0.1439, -0.4996, 0.0158], [ 0.1736, -0.0500, 0.1638], [ 0.0998, -0.2115, 0.2026], [ 0.2642, 0.0727, 0.4966], [ 0.2667, 0.0202, 0.5999], [ 0.5621, 0.4854, 1.0911]], [[ 0.0523, -0.3558, 0.1773], [ 0.0572, -0.2074, 0.2059], [ 0.0361, -0.2809, 0.0575], [-0.0671, -0.3611, 0.1202], [ 0.1032, -0.2193, -0.1928], [-0.1298, -0.5874, -0.3787], [ 0.0450, -0.2554, -0.2029], [ 0.0938, -0.0932, 0.3782], [ 0.2229, -0.1440, -0.1133], [ 0.3722, 0.3017, 1.1004]]], dtype=torch.float64, grad_fn=)
pred
tensor([[[ nan, -0.3666, 0.3765], [ nan, -0.0844, nan], [ 0.1625, nan, 0.3119], [-0.1102, -0.5474, -0.1901], [-0.1439, -0.4996, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, 0.4966], [ nan, 0.0202, nan], [ nan, nan, nan]], [[ nan, nan, nan], [ nan, nan, 0.2059], [ nan, nan, nan], [ nan, -0.3611, nan], [ 0.1032, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan]]], dtype=torch.float64)
mask
tensor([[[ True, False, False], [ True, False, True], [False, True, False], [False, False, False], [False, False, True], [ True, True, True], [ True, True, True], [ True, True, False], [ True, False, True], [ True, True, True]], [[ True, True, True], [ True, True, False], [ True, True, True], [ True, False, True], [False, True, True], [ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True]]])
all
tensor([[[ 0.0322, -0.3666, 0.3765], [ 0.1475, -0.0844, 0.4320], [ 0.1625, -0.0796, 0.3119], [-0.1102, -0.5474, -0.1901], [-0.1439, -0.4996, 0.0158], [ 0.1736, -0.0500, 0.1638], [ 0.0998, -0.2115, 0.2026], [ 0.2642, 0.0727, 0.4966], [ 0.2667, 0.0202, 0.5999], [ 0.5621, 0.4854, 1.0911]], [[ 0.0523, -0.3558, 0.1773], [ 0.0572, -0.2074, 0.2059], [ 0.0361, -0.2809, 0.0575], [-0.0671, -0.3611, 0.1202], [ 0.1032, -0.2193, -0.1928], [-0.1298, -0.5874, -0.3787], [ 0.0450, -0.2554, -0.2029], [ 0.0938, -0.0932, 0.3782], [ 0.2229, -0.1440, -0.1133], [ 0.3722, 0.3017, 1.1004]]], dtype=torch.float64, grad_fn=)
no_conditional
tensor([[[ nan, -0.3666, 0.3765], [ nan, -0.0844, nan], [ 0.1625, nan, 0.3119], [-0.1102, -0.5474, -0.1901], [-0.1439, -0.4996, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, 0.4966], [ nan, 0.0202, nan], [ nan, nan, nan]], [[ nan, nan, nan], [ nan, nan, 0.2059], [ nan, nan, nan], [ nan, -0.3611, nan], [ 0.1032, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan]]], dtype=torch.float64)
conditional
tensor([[[ nan, 0.0292, 0.6235], [ nan, 0.4156, nan], [ 0.3672, nan, 0.7999], [-0.1102, -0.5474, -0.1901], [-0.0686, -0.1575, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, 0.4823], [ nan, 0.1720, nan], [ nan, nan, nan]], [[ nan, nan, nan], [ nan, nan, 0.5191], [ nan, nan, nan], [ nan, 0.2014, nan], [ 0.6513, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan]]], dtype=torch.float64)
tensor([[ True, False, False],
[ True, False, True],
[False, True, False],
[False, False, False],
[False, False, True],
[ True, True, False],
[ True, False, True],
[ True, True, False],
[ True, False, True],
[False, True, True]])
tensor([[0.8775, nan, nan],
[0.6706, nan, 0.9272],
[ nan, 0.4967, nan],
[ nan, nan, nan],
[ nan, nan, 0.4760],
[0.9991, 0.1775, nan],
[0.6734, nan, 0.6468],
[0.3725, 0.2052, nan],
[0.5927, nan, 0.6441],
[ nan, 0.9132, 0.0329]], dtype=torch.float64)
tensor([[0.8775, nan, nan],
[0.6706, nan, 0.9272],
[ nan, 0.4967, nan],
[ nan, nan, nan],
[ nan, nan, 0.4760],
[0.9991, 0.1775, nan],
[0.6734, nan, 0.6468],
[0.3725, 0.2052, nan],
[0.5927, nan, 0.6441],
[ nan, 0.9132, 0.0329]], dtype=torch.float64)
So the standard filter is working better than the SR filter for the smoothing (with this parameter setting); there should be a way to make the SR smoother not that bad.
But I just want to see why we have `nan`.
So the problem is that we have a negative number on the diagonal, so the matrix is not positive definite and even the standard deviation is `nan`.
Built-in mutable sequence.
If no argument is given, the constructor creates a new empty list. The argument must be an iterable if specified.
Local slope models are an extension of the local level model that also keeps track of the slope in the state variable.
Given \(n\) as the number of dimensions of the observations
The transition matrix (A
) is:
\[A = \left[\begin{array}{cc}I & I \\ 0 & I\end{array}\right]\]
where:
the state is \(x \in \mathbb{R}^{2n \times 1}\), where the upper half keeps track of the level and the lower half of the slope, and \(A \in \mathbb{R}^{2n \times 2n}\)
the observation matrix (H
) is:
\[H = \left[\begin{array}{cc}I & 0 \end{array}\right]\]
For the multivariate case the 1s are replaced with identity matrices
Assuming that the control has the same dimensions as the observations, then for a local slope model we have \(B \in \mathbb{R}^{state \times contr}\): \[ B = \begin{bmatrix} -I & I \\ 0 & 0 \end{bmatrix}\]
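A minimal sketch of building these block matrices with torch.eye (hypothetical helper name; the library's own init methods may differ):

import torch

def local_slope_matrices(n):
    "Sketch: A, H and B for a local slope model with n observed variables."
    I = torch.eye(n, dtype=torch.float64)
    Z = torch.zeros(n, n, dtype=torch.float64)
    A = torch.cat([torch.cat([I, I], dim=1),
                   torch.cat([Z, I], dim=1)], dim=0)    # [[I, I], [0, I]]
    H = torch.cat([I, Z], dim=1)                        # [I, 0]
    B = torch.cat([torch.cat([-I, I], dim=1),
                   torch.cat([Z, Z], dim=1)], dim=0)    # [[-I, I], [0, 0]], following the block formula above
    return A, H, B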
Built-in mutable sequence.
If no argument is given, the constructor creates a new empty list. The argument must be an iterable if specified.